Markov decision processes with fuzzy rewards

نویسندگان

  • Masami Kurano
  • Masami Yasuda
  • Jun-ichi Nakagami
  • Yuji Yoshida
چکیده

In this paper, we consider the model that the information on the rewards in vector-valued Markov decision processes includes imprecision or ambiguity. The fuzzy reward model is analyzed as follows: The fuzzy reward is represented by the fuzzy set on the multi-dimensional Euclidian space R and the infinite horizon fuzzy expected discounted reward(FEDR) from any stationary policy is characterized as a unique fixed point of the corresponding contractive operator. Also, we fined a Pareto optimal policy which maximizes the infinite horizon FEDR over all stationary policies under the pseudo order induced by a convex cone R. As a numerical example, the machine maintenance problem is considered.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A fuzzy treatment of uncertain Markov decision processes: Average case

In this paper, the uncertain transition matrices for inhomogeneous Markov decision processes are described by use of fuzzy sets. Introducing a ν-step contractive property, called a minorization condition, for the average case, we fined a Pareto optimal policy maximizing the average expected fuzzy rewards under some partial order. The Pareto optimal policies are characterized by maximal solution...

متن کامل

Lightweight Monte Carlo Verification of Markov Decision Processes with Rewards

Markov decision processes are useful models of concurrency optimisation problems, but are often intractable for exhaustive verification methods. Recent work has introduced lightweight approximative techniques that sample directly from scheduler space, bringing the prospect of scalable alternatives to standard numerical algorithms. The focus so far has been on optimising the probability of a pro...

متن کامل

Multistage Markov Decision Processes with Minimum Criteria of Random Rewards

We consider multistage decision processes where criterion function is an expectation of minimum function. We formulate them as Markov decision processes with imbedded parameters. The policy depends upon a history including past imbedded parameters, and the rewards at each stage are random and depend upon current state, action and a next state. We then give an optimality equation by using operat...

متن کامل

Optimal Control of Piecewise Deterministic Markov Processes with Finite Time Horizon

In this paper we study controlled Piecewise Deterministic Markov Processes with finite time horizon and unbounded rewards. Using an embedding procedure we reduce these problems to discrete-time Markov Decision Processes. Under some continuity and compactness conditions we establish the existence of an optimal policy and show that the value function is the unique solution of the Bellman equation...

متن کامل

Finite-horizon variance penalised Markov decision processes

We consider a finite horizon Markov decision process with only terminal rewards. We describe a finite algorithm for computing a Markov deterministic policy which maximises the variance penalised reward and we outline a vertex elimination algorithm which can reduce the computation involved.

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2003